{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Kubeflow Serving (KFServing) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is a sample for KFServing SDK. \n",
    "\n",
    "The notebook shows how to use KFServing SDK to create, get, rollout_canary, promote and delete InferenceService."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install kfserving==0.3.0.1 --user"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Restart the kernel to pick up pip installed libraries\n",
    "from IPython.core.display import HTML\n",
    "HTML(\"<script>Jupyter.notebook.kernel.restart()</script>\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "time.sleep(5)\n",
    "\n",
    "from kubernetes import client\n",
    "\n",
    "from kfserving import KFServingClient\n",
    "from kfserving import constants\n",
    "from kfserving import utils\n",
    "from kfserving import V1alpha2EndpointSpec\n",
    "from kfserving import V1alpha2PredictorSpec\n",
    "from kfserving import V1alpha2TensorflowSpec\n",
    "from kfserving import V1alpha2InferenceServiceSpec\n",
    "from kfserving import V1alpha2InferenceService\n",
    "from kubernetes.client import V1ResourceRequirements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define namespace where InferenceService needs to be deployed to. If not specified, below function defines namespace to the current one where SDK is running in the cluster, otherwise it will deploy to default namespace."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "namespace = utils.get_default_target_namespace()\n",
    "print(namespace)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Allow KFServing Inference Services to be created from the notebook"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Adding this on ClusterRole to skip error: inferenceservices.serving.kubeflow.org is forbidden: User system:serviceaccount:anonymous:default-editor cannot create resource inferenceservices\n",
    "\n",
    "```shell\n",
    "kubectl edit clusterrole kubeflow-kubernetes-edit -n kubeflow\n",
    "```\n",
    "\n",
    "Add following policies to cluster role lists.\n",
    "```yaml\n",
    "- apiGroups:\n",
    "  - kubeflow.org\n",
    "  - serving.kubeflow.org\n",
    "  resources:\n",
    "  - '*'\n",
    "  verbs:\n",
    "  - '*'\n",
    "- apiGroups:\n",
    "  - networking.istio.io\n",
    "  resources:\n",
    "  - '*'\n",
    "  verbs:\n",
    "  - '*'\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define InferenceService"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Firstly define default endpoint spec, and then define the inferenceservice basic on the endpoint spec."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION\n",
    "default_endpoint_spec = V1alpha2EndpointSpec(\n",
    "                          predictor=V1alpha2PredictorSpec(\n",
    "                            min_replicas=1,\n",
    "                            tensorflow=V1alpha2TensorflowSpec(\n",
    "                              storage_uri='gs://kfserving-samples/models/tensorflow/flowers',\n",
    "                              resources=V1ResourceRequirements(\n",
    "                                  requests={'cpu':'100m','memory':'0.5Gi'},\n",
    "                                  limits={'cpu':'100m', 'memory':'0.5Gi'}))))\n",
    "    \n",
    "isvc = V1alpha2InferenceService(api_version=api_version,\n",
    "                          kind=constants.KFSERVING_KIND,\n",
    "                          metadata=client.V1ObjectMeta(\n",
    "                              name='flowers-sample', namespace=namespace),\n",
    "                          spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create InferenceService to Receive 100% Traffic"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Call KFServingClient to create InferenceService."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "kf_serving = KFServingClient()\n",
    "kf_serving.create(isvc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check the Status of InferenceService\n",
    "_Note:  Re-run this until the service shows READY=True_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl get inferenceservices -n $namespace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "kf_serving.get('flowers-sample', \n",
    "               namespace=namespace, \n",
    "               watch=True, \n",
    "               timeout_seconds=120)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n anonymous get inferenceservice flowers-sample -o jsonpath='{.status.url}' | cut -d \"/\" -f 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run a Prediction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "INPUT_PATH=@./input.json\n",
    "\n",
    "MODEL_NAME=flowers-sample\n",
    "echo $MODEL_NAME\n",
    "\n",
    "INGRESS_GATEWAY=kfserving-ingressgateway\n",
    "echo $INGRESS_GATEWAY\n",
    "\n",
    "ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')\n",
    "echo $ISTIO_HOST_IP\n",
    "\n",
    "ISTIO_NODE_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=\"{.spec.ports[?(@.name=='http2')].nodePort}\")\n",
    "echo $ISTIO_NODE_PORT\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl -n anonymous get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "echo $SERVICE_HOSTNAME\n",
    "\n",
    "curl -v -H \"Host: ${SERVICE_HOSTNAME}\"  http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "INPUT_PATH=@./input.json\n",
    "\n",
    "MODEL_NAME=flowers-sample\n",
    "echo $MODEL_NAME\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl -n anonymous get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "echo $SERVICE_HOSTNAME\n",
    "\n",
    "curl -v -H \"Host: ${SERVICE_HOSTNAME}\" http://${SERVICE_HOSTNAME}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "MODEL_NAME=sklearn-iris\n",
    "echo $MODEL_NAME\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl -n anonymous get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "echo $SERVICE_HOSTNAME\n",
    "\n",
    "curl -v -H \"Host: sklearn-iris.anonymous.example.com\" http://sklearn-iris.anonymous.example.com/v1/models/sklearn-iris:predict -d '{\"instances\":[[6.8,2.8,4.8,1.4],[6.0,3.4,4.5,1.6]]}'\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n anonymous get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d \"/\" -f 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!kubectl  get gateway kfserving-gateway"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n istio-system get service $INGRESS_GATEWAY"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "MODEL_NAME=flowers-sample\n",
    "INPUT_PATH=@./input.json\n",
    "INGRESS_GATEWAY=kfserving-ingressgateway\n",
    "\n",
    "CLUSTER_IP=$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n",
    "echo \"ClusterIP: $CLUSTER_IP\"\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "echo $SERVICE_HOSTNAME\n",
    "\n",
    "curl -v -H \"Host: ${SERVICE_HOSTNAME}\" http://$CLUSTER_IP/v1/models/$MODEL_NAME:predict -d $INPUT_PATH"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Add a New InferenceService (with a New Model) to Receive 10% Traffic\n",
    "Firstly define canary endpoint spec, and then rollout 10% traffic to the canary version, watch the rollout process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "canary_endpoint_spec = V1alpha2EndpointSpec(\n",
    "                         predictor=V1alpha2PredictorSpec(\n",
    "                           min_replicas=1,\n",
    "                           tensorflow=V1alpha2TensorflowSpec(\n",
    "                             storage_uri='gs://kfserving-samples/models/tensorflow/flowers-2',\n",
    "                             resources=V1ResourceRequirements(\n",
    "                                 requests={'cpu':'100m','memory':'0.5Gi'},\n",
    "                                 limits={'cpu':'100m', 'memory':'0.5Gi'}))))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "kf_serving.rollout_canary('flowers-sample', \n",
    "                          canary=canary_endpoint_spec, \n",
    "                          percent=10,\n",
    "                          namespace=namespace, \n",
    "                          watch=True, \n",
    "                          timeout_seconds=120)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl get inferenceservices -n $namespace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "MODEL_NAME=flowers-sample\n",
    "INPUT_PATH=@./input.json\n",
    "INGRESS_GATEWAY=istio-ingressgateway\n",
    "\n",
    "ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')\n",
    "echo $ISTIO_HOST_IP\n",
    "\n",
    "ISTIO_NODE_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=\"{.spec.ports[?(@.name=='http2')].nodePort}\")\n",
    "echo $ISTIO_NODE_PORT\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl -n anonymous get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "\n",
    "curl -v -H \"Host: ${SERVICE_HOSTNAME}\"  http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Send More Traffic to the New InferenceService\n",
    "Send 50% traffic to the new model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "kf_serving.rollout_canary('flowers-sample',\n",
    "                          percent=50, \n",
    "                          namespace=namespace,\n",
    "                          watch=True, \n",
    "                          timeout_seconds=120)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl get inferenceservices -n $namespace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "MODEL_NAME=flowers-sample\n",
    "INPUT_PATH=@./input.json\n",
    "INGRESS_GATEWAY=istio-ingressgateway\n",
    "\n",
    "ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')\n",
    "echo $ISTIO_HOST_IP\n",
    "\n",
    "ISTIO_NODE_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=\"{.spec.ports[?(@.name=='http2')].nodePort}\")\n",
    "echo $ISTIO_NODE_PORT\n",
    "\n",
    "SERVICE_HOSTNAME=$(kubectl -n anonymous get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "\n",
    "curl -v -H \"Host: ${SERVICE_HOSTNAME}\"  http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (Optional) Delete the InferenceService"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "kf_serving.delete('flowers-sample', \n",
    "                  namespace=namespace)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
